They are Not Completely Useless: Towards Recycling Transferable Unlabeled Data for Class-Mismatched Semi-Supervised Learning

Abstract

Semi-Supervised Learning (SSL) with mismatched classes deals with the problem that the classes-of-interests in the limited labeled data are only a subset of the classes in the massive unlabeled data. As a result, classical SSL methods would be misled by the classes which are only possessed by the unlabeled data. To solve this problem, some recent methods divide unlabeled data to useful in-distribution (ID) data and harmful out-of-distribution (OOD) data, among which the latter should particularly be weakened. As a result, the potential value contained by OOD data is largely overlooked. To remedy this defect, this paper proposes a “Transferable OOD data Recycling” (TOOR) method which properly utilizes ID data as well as the “recyclable” OOD data to enrich the information for conducting class-mismatched SSL. Specifically, TOOR treats the OOD data that have a close relationship with ID and labeled data as recyclable, and employs adversarial domain adaptation to project them to the space of ID and labeled data. In other words, the recyclability of an OOD datum is evaluated by its transferability, and the recyclable OOD data are transferred so that they are compatible with the distribution of known classes-of-interests. Consequently, our TOOR method extracts more information from unlabeled data than existing methods, so it achieves improved performance, which is demonstrated by the experiments on typical benchmark datasets.
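
The three-way treatment of unlabeled data described in the abstract (use ID data, recycle transferable OOD data, weaken the rest) can be illustrated with a minimal sketch. The abstract does not say how ID and OOD data are separated or how transferability is computed, so everything here is an assumption for illustration: the function name `partition_unlabeled`, the use of maximum softmax confidence as the ID test, the `domain_scores` input (imagined as a per-sample score in [0, 1], for instance from a domain discriminator), and both thresholds. The adversarial domain adaptation step itself is not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def partition_unlabeled(logits_per_sample, domain_scores,
                        id_threshold=0.9, recycle_threshold=0.5):
    """Split unlabeled samples into ID, recyclable OOD, and discarded OOD.

    Hypothetical rule, not the paper's actual criterion:
      - a sample is ID if its max softmax confidence >= id_threshold;
      - otherwise it is recyclable OOD if its transferability score
        (domain_scores[i], assumed to lie in [0, 1]) >= recycle_threshold;
      - otherwise it is treated as non-recyclable OOD.
    Returns three lists of sample indices.
    """
    id_idx, recyclable_idx, discarded_idx = [], [], []
    for i, (logits, score) in enumerate(zip(logits_per_sample, domain_scores)):
        confidence = max(softmax(logits))
        if confidence >= id_threshold:
            id_idx.append(i)
        elif score >= recycle_threshold:
            recyclable_idx.append(i)
        else:
            discarded_idx.append(i)
    return id_idx, recyclable_idx, discarded_idx


# Toy inputs: one confident sample, two near-uniform ones.
example_logits = [[6.0, 0.0, 0.0], [1.0, 0.9, 0.8], [0.2, 0.1, 0.0]]
example_scores = [0.9, 0.7, 0.1]
print(partition_unlabeled(example_logits, example_scores))
```

In this toy run the first sample is kept as ID, the second is marked recyclable because its assumed transferability score is high, and the third is set aside. In a full pipeline the recyclable group would be the input to the domain adaptation step.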

Similar resources

Estimate Unlabeled-Data-Distribution for Semi-supervised PU Learning

Traditional supervised classifiers use only labeled data (features/label pairs) as the training set, while the unlabeled data is used as the testing set. In practice, it is often the case that the labeled data is hard to obtain and the unlabeled data contains the instances that belong to the predefined class beyond the labeled data categories. This problem has been widely studied in recent year...

Semi-supervised Learning with Weakly-Related Unlabeled Data: Towards Better Text Categorization

The cluster assumption is exploited by most semi-supervised learning (SSL) methods. However, if the unlabeled data is merely weakly related to the target classes, it becomes questionable whether driving the decision boundary to the low density regions of the unlabeled data will help the classification. In such case, the cluster assumption may not be valid; and consequently how to leverage this ...

Using Unlabeled Data for Supervised Learning

Many classification problems have the property that the only costly part of obtaining examples is the class label. This paper suggests a simple method for using distribution information contained in unlabeled examples to augment labeled examples in a supervised training framework. Empirical tests show that the technique described in this paper can significantly improve the accuracy of a supervi...

Transferable Semi-supervised Semantic Segmentation

The performance of deep learning based semantic segmentation models heavily depends on sufficient data with careful annotations. However, even the largest public datasets only provide samples with pixel-level annotations for rather limited semantic categories. Such data scarcity critically limits scalability and applicability of semantic segmentation models in real applications. In this paper, ...

Semi-Supervised Support Vector Machines for Unlabeled Data Classification

A concave minimization approach is proposed for classifying unlabeled data based on the following ideas: (i) A small representative percentage (5% to 10%) of the unlabeled data is chosen by a clustering algorithm and given to an expert or oracle to label. (ii) A linear support vector machine is trained using the small labeled sample while simultaneously assigning the remaining bulk of the unlab...

Journal

Journal title: IEEE Transactions on Multimedia

Year: 2023

ISSN: 1520-9210, 1941-0077

DOI: https://doi.org/10.1109/tmm.2022.3179895